    hive> create database test;
    OK
    Time taken: 2.606 seconds



    [[email protected] testHivePara]$ hive -f student.sql
    Hive history file=/tmp/
    OK
    Time taken: 2.131 seconds
    OK
    Time taken: 0.878 seconds
    Copying data from file:/home/users/czt/testdata_student
    Copying file: file:/home/users/czt/testdata_student
    Loading data to table test.student
    OK
    Time taken: 1.76 seconds



    use test;

    --- Student information table
    create table IF NOT EXISTS student(
    sno     bigint  comment 'student number' ,
    sname   string  comment 'name' ,
    sage    bigint  comment 'age' ,
    pdate   string  comment 'enrollment date'
    )
    COMMENT 'Student information table'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;

    LOAD DATA LOCAL INPATH '/home/users/czt/testdata_student' INTO TABLE student;



    1       name1   21      20130901
    2       name2   22      20130901
    3       name3   23      20130901
    4       name4   24      20130901
    5       name5   25      20130902
    6       name6   26      20130902
    7       name7   27      20130902
    8       name8   28      20130902
    9       name9   29      20130903
    10      name10  30      20130903
    11      name11  31      20130903
    12      name12  32      20130904
    13      name13  33      20130904


Method 1: set variables in the shell and use them directly in hive -e


    #!/bin/bash
    tablename="student"
    limitcount="8"

    hive -S -e "use test; select * from ${tablename} limit ${limitcount};"



    [[email protected] testHivePara]$ sh -x
    + tablename=student
    + limitcount=8
    + hive -S -e 'use test; select * from student limit 8;'
    1       name1    21      20130901
    2       name2    22      20130901
    3       name3    23      20130901
    4       name4    24      20130901
    5       name5    25      20130902
    6       name6    26      20130902
    7       name7    27      20130902
    8       name8    28      20130902


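Hive's query language is SQL-like, so it lacks the flexibility and flow control of a shell. That makes the shell + Hive development pattern very common: define variables directly in the shell, then reference them directly in the hive -e statement.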
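Because the shell pastes the variable values straight into the SQL text, it can help to check them before they reach Hive. The following is a minimal sketch of that idea, not part of the original scripts: `build_query` is a hypothetical helper, and the `command -v hive` guard is an assumption added so the script still runs on a machine without the Hive CLI.

```shell
#!/bin/bash
# Hypothetical helper: build the query text from two shell values.
build_query() {
    local tablename="$1" limitcount="$2"
    # Accept only a plain identifier for the table name...
    case "$tablename" in (*[!A-Za-z0-9_]*|"") return 1 ;; esac
    # ...and only digits for the row limit.
    case "$limitcount" in (*[!0-9]*|"") return 1 ;; esac
    printf 'use test; select * from %s limit %s;' "$tablename" "$limitcount"
}

query="$(build_query "student" "8")" || exit 1
echo "$query"

# Run the query only where the hive CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
    hive -S -e "$query"
fi
```

With the values from the example above, the printed query matches the one shown in the `sh -x` trace.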

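Note: variables defined with -hiveconf cannot be used this way in hive -e. Inside a double-quoted string, the shell expands ${hiveconf:...} itself before Hive ever sees it, so the value arrives empty.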


    #!/bin/bash
    tablename="student"
    limitcount="8"

    hive -S \
        -hiveconf enter_school_date="20130902" \
        -hiveconf min_age="26" \
        -e \
        "    use test; \
            select * from ${tablename} \
            where \
                pdate='${hiveconf:enter_school_date}' \
                and \
                sage>'${hiveconf:min_age}' \
            limit ${limitcount};"




    + hive -S -hiveconf enter_school_date=20130902 -hiveconf min_age=26 -e 'use test; explain select * from student where pdate='\'''\'' and sage>'\'''\'' limit 8;'
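The empty strings in the trace come from shell quoting rather than from Hive. The sketch below reproduces the effect without calling Hive at all. It assumes bash and that no shell variable named `hiveconf` is set; the variable names `expanded_by_shell` and `left_for_hive` are made up for illustration.

```shell
#!/bin/bash
# Make sure the shell has no variables of its own with these names.
unset hiveconf min_age

# Double quotes: bash reads ${hiveconf:min_age} as a substring expansion
# of its own (unset) variable "hiveconf", so nothing is left for Hive.
expanded_by_shell="sage>'${hiveconf:min_age}'"

# Escaping the dollar sign keeps the reference literal, while ordinary
# shell variables such as ${limitcount} are still expanded.
limitcount="8"
left_for_hive="sage>'\${hiveconf:min_age}' limit ${limitcount}"

echo "$expanded_by_shell"
echo "$left_for_hive"
```

The second form leaves `${hiveconf:min_age}` in the text, so a call such as `hive -S -hiveconf min_age="26" -e "use test; select * from student where ${left_for_hive};"` should hand the reference to Hive for substitution. Treat that as something to verify on your own Hive version.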



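Line breaks and similar details make hive -e awkward, so it only suits small amounts of SQL. In practice you usually write many .hql files and run them with hive -f. In that case you can define variables with -hiveconf and use them directly in the SQL.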


    #!/bin/bash

    hive -hiveconf enter_school_date="20130902" -hiveconf min_ag="26" -f testvar.sql


    use test;

    select * from student
    where
        pdate='${hiveconf:enter_school_date}'
        and
        sage > '${hiveconf:min_ag}'
    limit 8;


    [[email protected] testHivePara]$ sh -x
    + hive -hiveconf enter_school_date=20130902 -hiveconf min_ag=26 -f testvar.sql
    Hive history file=/tmp/czt/hive_job_log_czt_201309131651_2035045625.txt
    OK
    Time taken: 2.143 seconds
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Kill Command = hadoop job -kill job_20130911213659_42303
    2013-09-13 16:52:00,300 Stage-1 map = 0%,  reduce = 0%
    2013-09-13 16:52:14,609 Stage-1 map = 28%,  reduce = 0%
    2013-09-13 16:52:24,642 Stage-1 map = 71%,  reduce = 0%
    2013-09-13 16:52:34,639 Stage-1 map = 98%,  reduce = 0%
    Ended Job = job_20130911213659_42303
    OK
    7       name7   27      20130902
    8       name8   28      20130902
    Time taken: 54.268 seconds
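When the shell script itself generates the .sql file, the same quoting concern applies. The following is a rough sketch, assuming bash: it writes the query with a quoted here-document so the `${hiveconf:...}` references stay literal, then passes the values with -hiveconf. The scratch directory, the `hive_args` array, and the `command -v hive` guard are assumptions for illustration, not part of the original scripts.

```shell
#!/bin/bash
# Work in a scratch directory so no existing file is overwritten.
workdir="$(mktemp -d)"
sqlfile="$workdir/testvar.sql"

# The quoted delimiter ('EOF') stops the shell from expanding ${...},
# so the hiveconf references are written to the file unchanged.
cat > "$sqlfile" <<'EOF'
use test;

select * from student
where
    pdate='${hiveconf:enter_school_date}'
    and
    sage > '${hiveconf:min_ag}'
limit 8;
EOF

# Assemble the arguments once so they can be printed and reused.
hive_args=(-hiveconf enter_school_date="20130902" -hiveconf min_ag="26" -f "$sqlfile")
echo "hive ${hive_args[*]}"

# Run the file only where the hive CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
    hive "${hive_args[@]}"
fi
```

Keeping the arguments in an array avoids re-quoting them when the same call is logged and then executed.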







